This article tabulates continuous probability density functions and discrete probability mass functions which maximize the differential entropy or absolute entropy, respectively, among all probability distributions with a given Lp-norm (i.e., a given pth absolute moment when p is a finite integer) and unconstrained or constrained value set. Expressions for the maximum entropy are evaluated as functions of the Lp-norm. The most interesting results are obtained and plotted for unconstrained (real-valued) continuous random variables and for integer-valued discrete random variables.

The maximum entropy expressions are obtained in closed form for unconstrained continuous random variables, and in this case there is a simple straight-line relationship between the maximum differential entropy and the logarithm of the Lp-norm. Corresponding expressions for arbitrary discrete and constrained continuous random variables are given parametrically; closed-form expressions are available only for special cases. However, simpler alternative bounds on the maximum entropy of integer-valued discrete random variables are obtained by applying the differential entropy results to continuous random variables which approximate the integer-valued random variables in a natural manner.

Most of these results are not new. The purpose of this article is to present all the results in an integrated framework that includes continuous and discrete random variables, constraints on the permissible value set, and all possible values of p. Understanding such as this is useful in evaluating the performance of data compression schemes.

I. Introduction

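The differential entropy $h\{x\}$ of a continuous, real-valued random variable x with probability density f(x) is defined as

$$h\{x\} = -E\{\log[f(x)]\} = -\int_{-\infty}^{\infty} f(x)\log[f(x)]\,dx \tag{1}$$

For any positive (or infinite) integer p = 1, 2, 3, ..., ∞, define the Lp-norm $M_p\{x\}$ of the random variable x as

$$M_p\{x\} = \left[E\{|x|^p\}\right]^{1/p} = \left[\int_{-\infty}^{\infty} |x|^p f(x)\,dx\right]^{1/p}, \quad p = 1, 2, 3, \ldots$$

$$M_\infty\{x\} = \lim_{p\to\infty} M_p\{x\} = \operatorname*{ess\,sup}_{f(x)>0} |x| \tag{2}$$

The essential supremum in Eq. (2) is the smallest number that upper bounds |x| almost surely.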

Sometimes the real-valued random variable x is constrained to lie within a subset $\Xi$ of the real line; in this case, the integrals in Eqs. (1) and (2) need only extend over the subset $\Xi$.

For a discrete random variable X with discrete value set $\Xi = \{\xi_i\}$ and probability mass function $F(\xi_i)$, its (absolute) entropy $H\{X\}$ is defined as

$$H\{X\} = -E\{\log[F(X)]\} = -\sum_i F(\xi_i)\log[F(\xi_i)] \tag{3}$$

The Lp-norm $M_p\{X\}$ of the discrete random variable X is defined as

$$M_p\{X\} = \left[E\{|X|^p\}\right]^{1/p} = \left[\sum_i |\xi_i|^p F(\xi_i)\right]^{1/p}, \quad p = 1, 2, 3, \ldots$$

$$M_\infty\{X\} = \lim_{p\to\infty} M_p\{X\} = \sup_{F(\xi_i)>0} |\xi_i| \tag{4}$$

This article tabulates continuous probability density functions $f(x) = f_p^*(x;\mu)$ or $f(x) = f_p^*(x;\mu,\Xi)$ and discrete probability mass functions $F(\xi_i) = F_p^*(\xi_i;\mu,\Xi)$ which maximize the differential entropy $h\{x\}$ or absolute entropy $H\{X\}$, respectively, among all probability distributions with a given Lp-norm $M_p\{x\}$ or $M_p\{X\}$ and unconstrained or constrained value set $\Xi$. The most interesting results are obtained and plotted for unconstrained continuous random variables and for integer-valued discrete random variables. Finally, alternative simpler bounds on the entropy of integer-valued random variables are obtained by modifying the bounds on differential entropy for unconstrained continuous random variables.

Most of these results are not new. In fact, the maximum-entropy continuous distributions for p = 1, 2 (Laplacian and Gaussian distributions, respectively) have been known since Shannon's original work [1]. The purpose of this article is to present all the results in an integrated framework that includes continuous and discrete random variables, constraints on the permissible value set, and all possible values of p.

Throughout this article, regular italic notation is used for an ordinary function of a real variable, such as $f(x)$ or $F(\xi_i)$, while boldface notation is used for an operator applied to a random variable, such as $h\{x\}$ or $H\{X\}$, $M_p\{x\}$ or $M_p\{X\}$, or the expectation operator $E\{\cdot\}$. In order not to interrupt the main presentation, proofs of all stated results are relegated to the Appendix.

II. Effects of Elementary Transformations

A scaled random variable $x' = qx$ or $X' = qX$, where q is a constant, has a correspondingly scaled Lp-norm:

$$M_p\{x'\} = |q|\,M_p\{x\}$$

$$M_p\{X'\} = |q|\,M_p\{X\} \tag{5}$$

A discrete random variable X with value set $\Xi = \{\xi_i\}$ scales to a discrete random variable $X'$ with scaled value set $q\Xi = \{q\xi_i\}$. The entropy of a discrete random variable is unaffected by scaling, but the differential entropy of a scaled continuous random variable either increases or decreases:

$$h\{x'\} = h\{x\} + \log[|q|]$$

$$H\{X'\} = H\{X\} \tag{6}$$

The change in the differential entropy of a scaled continuous random variable exactly equals the change in the logarithm of its Lp-norm:

$$h\{x'\} - h\{x\} = \log[M_p\{x'\}] - \log[M_p\{x\}] = \log[|q|] \tag{7}$$
In contrast, the Lp-norm of a discrete random variable can be made arbitrarily small or large without affecting its entropy, simply by scaling its value set.
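The scaling behavior in Eqs. (5) through (7) can be checked numerically. The following is a minimal sketch, not part of the original article: the value set, probabilities, and scale factor are illustrative assumptions, and the continuous case uses the Gaussian differential entropy $\log[\sqrt{2\pi e}\,\sigma]$ as a convenient closed form.

```python
import math

def lp_norm(values, probs, p):
    """Lp-norm [E|X|^p]^(1/p) of a discrete random variable; p may be math.inf."""
    if p == math.inf:
        return max(abs(v) for v, w in zip(values, probs) if w > 0)
    return sum(w * abs(v) ** p for v, w in zip(values, probs)) ** (1.0 / p)

def entropy_bits(probs):
    """Absolute entropy in bits; it depends only on the probabilities."""
    return -sum(w * math.log2(w) for w in probs if w > 0)

def gaussian_diff_entropy_bits(sigma):
    """Differential entropy of a Gaussian density with standard deviation sigma."""
    return math.log2(math.sqrt(2 * math.pi * math.e) * sigma)

# Illustrative (assumed) discrete distribution and scale factor.
values = [-2, -1, 0, 1, 3]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
q = 0.01

# Discrete case: the norm scales by |q| while the entropy is untouched,
# because scaling relabels the values and leaves the probabilities alone.
scaled_values = [q * v for v in values]
norm_ratio = lp_norm(scaled_values, probs, 2) / lp_norm(values, probs, 2)
discrete_entropy = entropy_bits(probs)

# Continuous case: the differential entropy shifts by log|q|.
sigma = 1.5
entropy_shift = (gaussian_diff_entropy_bits(abs(q) * sigma)
                 - gaussian_diff_entropy_bits(sigma))
```

Here `norm_ratio` reproduces |q| and `entropy_shift` reproduces log2|q|, while `discrete_entropy` is the same before and after scaling.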

A shifted random variable $x'' = x - \Delta$ or $X'' = X - \Delta$, where $\Delta$ is a constant, has the same differential or absolute entropy as the unshifted random variable,

$$h\{x''\} = h\{x\}$$

$$H\{X''\} = H\{X\} \tag{8}$$

but a different Lp-norm. A discrete random variable X with value set $\Xi = \{\xi_i\}$ shifts to a discrete random variable $X''$ with shifted value set $\Xi - \Delta = \{\xi_i - \Delta\}$. A random variable x or X is centered with respect to the Lp-norm if no shifted version has a lower Lp-norm. A centered random variable $x_p^\circ$ or $X_p^\circ$ can be obtained from an uncentered random variable x or X by applying an optimum shift $\Delta = \Delta_p^\circ$. This optimum shift equals the median of the random variable for p = 1, the mean value of the random variable for p = 2, and the average of the essential infimum and essential supremum of the random variable for p = ∞. The centered Lp-norm $M_p^\circ\{x\}$ or $M_p^\circ\{X\}$ of the random variable x or X can be defined as

$$M_p^\circ\{x\} = \min_\Delta M_p\{x - \Delta\} = M_p\{x - \Delta_p^\circ\} = M_p\{x_p^\circ\}$$

$$M_p^\circ\{X\} = \min_\Delta M_p\{X - \Delta\} = M_p\{X - \Delta_p^\circ\} = M_p\{X_p^\circ\} \tag{9}$$
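The three optimum shifts named above (median, mean, midrange) can be illustrated with a brute-force search over candidate shifts. This is a sketch under assumed inputs: the distribution and the grid resolution are hypothetical choices, and a grid search is used only for transparency, not as an efficient method.

```python
import math

def lp_norm(values, probs, p):
    """Lp-norm of a discrete random variable; p may be math.inf."""
    if p == math.inf:
        return max(abs(v) for v, w in zip(values, probs) if w > 0)
    return sum(w * abs(v) ** p for v, w in zip(values, probs)) ** (1.0 / p)

def best_shift(values, probs, p, lo, hi, steps=10001):
    """Grid search for the shift that minimizes the Lp-norm of X - shift."""
    best_d, best_m = None, None
    for k in range(steps):
        d = lo + (hi - lo) * k / (steps - 1)
        m = lp_norm([v - d for v in values], probs, p)
        if best_m is None or m < best_m:
            best_d, best_m = d, m
    return best_d, best_m

# Illustrative (assumed) distribution: median 1, mean 1.9, midrange 5.
values = [0, 1, 2, 10]
probs = [0.2, 0.5, 0.2, 0.1]

shift_p1, centered_norm_p1 = best_shift(values, probs, 1, 0.0, 10.0)
shift_p2, centered_norm_p2 = best_shift(values, probs, 2, 0.0, 10.0)
shift_pinf, centered_norm_pinf = best_shift(values, probs, math.inf, 0.0, 10.0)
```

The second element of each result is the centered Lp-norm of Eq. (9) for that p, up to the grid resolution.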


III. Maximum Differential Entropy for Continuous Random Variables

For any positive real number $\mu$ and any positive (or infinite) integer p = 1, 2, ..., ∞, let $x_p^*(\mu)$ be a continuous random variable with probability density $f_p^*(x;\mu)$, where

$$f_p^*(x;\mu) = \frac{\exp(-|x|^p / p\mu^p)}{2\mu\, p^{1/p}\, \Gamma\!\left(\frac{p+1}{p}\right)}, \quad p = 1, 2, 3, \ldots$$

$$f_\infty^*(x;\mu) = \begin{cases} \dfrac{1}{2\mu}, & |x| \le \mu \\ 0, & |x| > \mu \end{cases} \tag{10}$$

and $\Gamma(\cdot)$ is the gamma function. These probability densities are all properly normalized, i.e.,

$$\int_{-\infty}^{\infty} f_p^*(x;\mu)\,dx = 1, \quad p = 1, 2, 3, \ldots, \infty \tag{11}$$

The probability densities $f_p^*(x;\mu)$ for p = 1, 2, ∞ are the well-known Laplacian, Gaussian, and uniform probability densities, respectively.

The absolute moments of these random variables are known in closed form:

$$E\{|x_p^*(\mu)|^n\} = \mu^n p^{n/p}\, \frac{\Gamma\!\left(\frac{n+1}{p}\right)}{\Gamma\!\left(\frac{1}{p}\right)}, \quad n = 1, 2, 3, \ldots, \quad p = 1, 2, 3, \ldots$$

$$E\{|x_\infty^*(\mu)|^n\} = \frac{\mu^n}{n+1}, \quad n = 1, 2, 3, \ldots \tag{12}$$

Evaluating these expressions for n = p or n → ∞ yields the Lp-norm $M_p^*(\mu)$ of the random variable $x_p^*(\mu)$:

$$M_p^*(\mu) = M_p\{x_p^*(\mu)\} = \mu, \quad p = 1, 2, 3, \ldots, \infty \tag{13}$$

The differential entropy $h_p^*(\mu)$ of the random variable $x_p^*(\mu)$ is calculated as

$$h_p^*(\mu) = h\{x_p^*(\mu)\} = \log\!\left[2\mu\, \Gamma\!\left(\tfrac{p+1}{p}\right)(pe)^{1/p}\right], \quad p = 1, 2, 3, \ldots$$

$$h_\infty^*(\mu) = h\{x_\infty^*(\mu)\} = \log[2\mu] \tag{14}$$

Explicit formulas for p = 1, 2 are

$$h_1^*(\mu) = \log[2e\mu]$$

$$h_2^*(\mu) = \log[\sqrt{2\pi e}\,\mu] \tag{15}$$

Since from Eq. (13) the parameter $\mu$ equals the Lp-norm $M_p^*(\mu)$ for any p, the differential entropy can be related directly to the corresponding Lp-norm:

$$h_p^*(\mu) = \log\!\left[2\,\Gamma\!\left(\tfrac{p+1}{p}\right)(pe)^{1/p}\right] + \log[M_p^*(\mu)], \quad p = 1, 2, 3, \ldots$$

$$h_\infty^*(\mu) = \log[2] + \log[M_\infty^*(\mu)] \tag{16}$$

The differential entropy $h_p^*(\mu)$ is plotted in Fig. 1 versus the logarithm of the corresponding Lp-norm, $\log[M_p^*(\mu)]$, for various values of p. Note that this is a simple straight-line relationship. In fact, the straight line has unit slope, assuming $\log[M_p^*(\mu)]$ is measured to the same logarithmic base as $h_p^*(\mu)$. This is consistent with the previous observation in Eq. (7), because the scaled version of the random variable $x_p^*(\mu)$ is statistically equivalent to the random variable with scaled Lp-norm, i.e.,

$$q\,x_p^*(\mu) \Leftrightarrow x_p^*(|q|\mu) \tag{17}$$
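The closed-form results of Eqs. (11), (13), and (14), and the unit-slope property just described, can be spot-checked by numerical integration. The sketch below assumes the density of Eq. (10) in the form $\exp(-|x|^p/p\mu^p)$ divided by $2\mu p^{1/p}\Gamma((p+1)/p)$; the midpoint rule, the integration limits, and the test values of $\mu$ and p are arbitrary choices made for illustration.

```python
import math

def density(x, mu, p):
    """Maximum-entropy density for finite p, as in Eq. (10)."""
    z = 2.0 * mu * p ** (1.0 / p) * math.gamma((p + 1.0) / p)
    return math.exp(-abs(x) ** p / (p * mu ** p)) / z

def max_entropy_nats(mu, p):
    """Closed-form differential entropy of Eq. (14), natural logarithm."""
    if p == math.inf:
        return math.log(2.0 * mu)
    return math.log(2.0 * mu * math.gamma((p + 1.0) / p) * (p * math.e) ** (1.0 / p))

def integrate(g, lo, hi, n=40000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))

mu, p = 1.5, 3           # illustrative parameter values
lim = 6.0 * mu           # tails beyond this limit are negligible for p = 3

def neg_f_log_f(x):
    f = density(x, mu, p)
    return -f * math.log(f) if f > 0.0 else 0.0

total = integrate(lambda x: density(x, mu, p), -lim, lim)              # Eq. (11)
norm = integrate(lambda x: abs(x) ** p * density(x, mu, p),
                 -lim, lim) ** (1.0 / p)                               # Eq. (13)
h_numeric = integrate(neg_f_log_f, -lim, lim)                          # Eq. (14)

# Doubling mu (and hence the Lp-norm) raises the entropy by exactly log 2.
slope_check = max_entropy_nats(2.0 * mu, p) - max_entropy_nats(mu, p)
```

Under these assumptions `total` is close to 1, `norm` is close to `mu`, and `h_numeric` agrees with `max_entropy_nats(mu, p)`.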

If x is any continuous random variable with differential entropy $h\{x\}$ and Lp-norm $M_p\{x\} = \mu$, then

$$h\{x\} \le h_p^*(M_p\{x\}) = h\{x_p^*(\mu)\}, \quad p = 1, 2, 3, \ldots, \infty \tag{18}$$

i.e., $x_p^*(\mu)$ is the maximum-entropy continuous random variable with a fixed Lp-norm $\mu$. Since the bound in Eq. (18) must be valid for all values of p,

$$h\{x\} \le \min_p h_p^*(M_p\{x\}) \tag{19}$$

If the random variable x is not centered with respect to the Lp-norm, the centered random variable $x_p^\circ = x - \Delta_p^\circ$ has the same differential entropy as x but a smaller Lp-norm. The differential entropy of x may be more tightly upper bounded by applying the bounds in Eqs. (18) and (19) to the differential entropy of $x_p^\circ$:

$$h\{x\} = h\{x_p^\circ\} \le h_p^*(M_p\{x_p^\circ\}) = h_p^*(M_p^\circ\{x\}), \quad p = 1, 2, 3, \ldots, \infty \tag{20}$$

and

$$h\{x\} \le \min_p h_p^*(M_p^\circ\{x\}) \tag{21}$$

If the real-valued continuous random variable x is constrained to lie within a subset $\Xi$ of the real line, its maximum possible differential entropy is smaller than that calculated above for a random variable constrained only by its Lp-norm. Maximum-entropy distributions for constrained continuous random variables can be obtained as simple generalizations of the foregoing results. Let $x_p^*(\mu,\Xi)$ be a continuous random variable with probability density $f_p^*(x;\mu,\Xi)$ equal to the conditional probability density of $x_p^*(\mu)$ given $\{x_p^*(\mu) \in \Xi\}$, i.e.,

$$f_p^*(x;\mu,\Xi) = \begin{cases} \dfrac{\exp(-|x|^p / p\mu^p)}{\alpha_p^*(\mu,\Xi)}, & x \in \Xi \\ 0, & x \notin \Xi \end{cases} \quad p = 1, 2, 3, \ldots$$

$$f_\infty^*(x;\mu,\Xi) = \begin{cases} \dfrac{1}{\alpha_\infty^*(\mu,\Xi)}, & |x| \le \mu \text{ and } x \in \Xi \\ 0, & \text{otherwise} \end{cases} \tag{22}$$

where

$$\alpha_p^*(\mu,\Xi) = \int_\Xi \exp(-|x|^p / p\mu^p)\,dx, \quad p = 1, 2, 3, \ldots$$

$$\alpha_\infty^*(\mu,\Xi) = \int_{\Xi \cap \{|x| \le \mu\}} 1\,dx \tag{23}$$

The Lp-norm $M_p^*(\mu,\Xi)$ of the random variable $x_p^*(\mu,\Xi)$ is given by

$$M_p^*(\mu,\Xi) = M_p\{x_p^*(\mu,\Xi)\} = \mu \left[\frac{\beta_p^*(\mu,\Xi)}{\alpha_p^*(\mu,\Xi)}\right]^{1/p}, \quad p = 1, 2, 3, \ldots$$

$$M_\infty^*(\mu,\Xi) = M_\infty\{x_\infty^*(\mu,\Xi)\} = \sup_{|x| \le \mu,\; x \in \Xi} |x| \tag{24}$$

where

$$\beta_p^*(\mu,\Xi) = \int_\Xi (|x|^p/\mu^p)\exp(-|x|^p / p\mu^p)\,dx, \quad p = 1, 2, 3, \ldots \tag{25}$$

The differential entropy $h_p^*(\mu,\Xi)$ of the random variable $x_p^*(\mu,\Xi)$ is given by

$$h_p^*(\mu,\Xi) = h\{x_p^*(\mu,\Xi)\} = \log[\alpha_p^*(\mu,\Xi)] + \frac{\log[e]}{p}\,\frac{\beta_p^*(\mu,\Xi)}{\alpha_p^*(\mu,\Xi)} = \log[\alpha_p^*(\mu,\Xi)] + \frac{\log[e]}{p}\left[\frac{M_p^*(\mu,\Xi)}{\mu}\right]^p, \quad p = 1, 2, 3, \ldots$$

$$h_\infty^*(\mu,\Xi) = h\{x_\infty^*(\mu,\Xi)\} = \log[\alpha_\infty^*(\mu,\Xi)] \tag{26}$$

The random variable $x_p^*(\mu,\Xi)$ is the maximum-entropy continuous random variable with constrained value set $\Xi$ and fixed Lp-norm $M_p^*(\mu,\Xi)$, i.e., if x is any continuous random variable with value set $\Xi$, differential entropy $h\{x\}$, and Lp-norm $M_p\{x\}$, then

$$h\{x\} \le h\{x_p^*(\mu_p,\Xi)\} = h_p^*(\mu_p,\Xi), \quad p = 1, 2, 3, \ldots, \infty \tag{27}$$

where $\mu_p$ is chosen to match the Lp-norm of x:

$$M_p^*(\mu_p,\Xi) = M_p\{x\}, \quad p = 1, 2, 3, \ldots, \infty \tag{28}$$

Since the bound in Eq. (27) must be valid for all values of p,

$$h\{x\} \le \min_p h_p^*(\mu_p,\Xi) \tag{29}$$

If the random variable x is not centered with respect to the Lp-norm, the differential entropy of x may be more tightly upper bounded by applying the bounds in Eqs. (27) and (29) to the differential entropy of the centered random variable $x_p^\circ = x - \Delta_p^\circ$:

$$h\{x\} = h\{x_p^\circ\} \le h\{x_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ)\} = h_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ), \quad p = 1, 2, 3, \ldots, \infty \tag{30}$$

and

$$h\{x\} \le \min_p h_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ) \tag{31}$$

where $\mu_p^\circ$ is chosen to match the Lp-norm of $x_p^\circ$ (i.e., the centered Lp-norm of x):

$$M_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ) = M_p\{x_p^\circ\} = M_p^\circ\{x\}, \quad p = 1, 2, 3, \ldots, \infty \tag{32}$$

Notice that the bounds on the right-hand sides of Eqs. (30) and (31) are calculated with reference to the shifted value sets $\Xi - \Delta_p^\circ$, not the actual value set $\Xi$.

The integrals defining $\alpha_p^*(\mu,\Xi)$ and $\beta_p^*(\mu,\Xi)$ are generally not obtainable in closed form for an arbitrary value set $\Xi$. An interesting exception is when the value set equals the positive half-line, i.e., $\Xi = R^+ = (0,\infty)$. In this case,

$$M_p^*(\mu,R^+) = M_p^*(\mu) = \mu, \quad p = 1, 2, 3, \ldots, \infty \tag{33}$$

and

$$h_p^*(\mu,R^+) = h_p^*(\mu) - \log[2], \quad p = 1, 2, 3, \ldots, \infty \tag{34}$$

In other words, the maximum possible differential entropy for a positive-valued continuous random variable is exactly one bit less than the maximum differential entropy for a real-valued random variable with the same Lp-norm.

IV. Maximum Entropy for Discrete Random Variables

Discrete versions $F_p^*(\xi_i;\mu,\Xi)$ of the probability densities $f_p^*(x;\mu)$ can be defined in a natural manner for discrete random variables $X_p^*(\mu,\Xi)$ with discrete value set $\Xi = \{\xi_i\}$:

$$F_p^*(\xi_i;\mu,\Xi) = \frac{f_p^*(\xi_i;\mu)}{\sum_j f_p^*(\xi_j;\mu)} = \frac{\exp(-|\xi_i|^p / p\mu^p)}{A_p^*(\mu,\Xi)}, \quad p = 1, 2, 3, \ldots$$

$$F_\infty^*(\xi_i;\mu,\Xi) = \frac{f_\infty^*(\xi_i;\mu)}{\sum_j f_\infty^*(\xi_j;\mu)} = \begin{cases} \dfrac{1}{A_\infty^*(\mu,\Xi)}, & |\xi_i| \le \mu \\ 0, & |\xi_i| > \mu \end{cases} \tag{35}$$

where

$$A_p^*(\mu,\Xi) = \sum_i \exp(-|\xi_i|^p / p\mu^p), \quad p = 1, 2, 3, \ldots$$

$$A_\infty^*(\mu,\Xi) = \sum_{|\xi_i| \le \mu} 1 \tag{36}$$

The discrete probability mass function $F_p^*(\xi_i;\mu,\Xi)$ equals the conditional probability mass function for the maximum-entropy continuous random variable $x_p^*(\mu)$, given $\{x_p^*(\mu) \in \Xi\}$.

The Lp-norm $M_p^*(\mu,\Xi)$ of the discrete random variable $X_p^*(\mu,\Xi)$ is given by

$$M_p^*(\mu,\Xi) = M_p\{X_p^*(\mu,\Xi)\} = \mu \left[\frac{B_p^*(\mu,\Xi)}{A_p^*(\mu,\Xi)}\right]^{1/p}, \quad p = 1, 2, 3, \ldots$$

$$M_\infty^*(\mu,\Xi) = M_\infty\{X_\infty^*(\mu,\Xi)\} = \sup_{|\xi_i| \le \mu} |\xi_i| \tag{37}$$

where

$$B_p^*(\mu,\Xi) = \sum_i (|\xi_i|^p/\mu^p)\exp(-|\xi_i|^p / p\mu^p), \quad p = 1, 2, 3, \ldots \tag{38}$$
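As a concrete illustration of Eqs. (35) through (38), the sketch below builds the discrete maximum-entropy probability mass function on a finite value set and confirms that the Lp-norm computed from the sums A and B matches the norm computed directly from the probabilities. The truncated integer range standing in for the set of all integers, and the values of $\mu$ and p, are assumptions made for the example.

```python
import math

def maxent_sums(values, mu, p):
    """Return the weights, A (Eq. 36), and B (Eq. 38) for finite p."""
    weights = [math.exp(-abs(v) ** p / (p * mu ** p)) for v in values]
    a = sum(weights)
    b = sum((abs(v) / mu) ** p * w for v, w in zip(values, weights))
    return weights, a, b

def maxent_pmf(values, mu, p):
    """Probability mass function of Eq. (35) on a finite value set."""
    weights, a, _ = maxent_sums(values, mu, p)
    return [w / a for w in weights]

def norm_from_sums(values, mu, p):
    """Lp-norm via Eq. (37): mu * (B / A)^(1/p)."""
    _, a, b = maxent_sums(values, mu, p)
    return mu * (b / a) ** (1.0 / p)

# Hypothetical setup: integers truncated to [-50, 50], mu = 3, p = 2.
values = list(range(-50, 51))
mu, p = 3.0, 2

pmf = maxent_pmf(values, mu, p)
norm_direct = sum(w * abs(v) ** p for v, w in zip(values, pmf)) ** (1.0 / p)
norm_via_sums = norm_from_sums(values, mu, p)
```

Because the integers are spaced closely relative to $\mu$ = 3 here, both norms come out very near $\mu$, as they would for the continuous density of Eq. (10).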


The entropy $H_p^*(\mu,\Xi)$ of the discrete random variable $X_p^*(\mu,\Xi)$ is given by

$$H_p^*(\mu,\Xi) = H\{X_p^*(\mu,\Xi)\} = \log[A_p^*(\mu,\Xi)] + \frac{\log[e]}{p}\,\frac{B_p^*(\mu,\Xi)}{A_p^*(\mu,\Xi)} = \log[A_p^*(\mu,\Xi)] + \frac{\log[e]}{p}\left[\frac{M_p^*(\mu,\Xi)}{\mu}\right]^p, \quad p = 1, 2, 3, \ldots$$

$$H_\infty^*(\mu,\Xi) = H\{X_\infty^*(\mu,\Xi)\} = \log[A_\infty^*(\mu,\Xi)] \tag{39}$$

The random variable $X_p^*(\mu,\Xi)$ is the maximum-entropy discrete random variable with value set $\Xi$ and fixed Lp-norm $M_p^*(\mu,\Xi)$; i.e., if X is any discrete random variable with value set $\Xi$, entropy $H\{X\}$, and Lp-norm $M_p\{X\}$, then

$$H\{X\} \le H\{X_p^*(\mu_p,\Xi)\} = H_p^*(\mu_p,\Xi), \quad p = 1, 2, 3, \ldots, \infty \tag{40}$$

where $\mu_p$ is chosen to match the Lp-norm of X:

$$M_p^*(\mu_p,\Xi) = M_p\{X\}, \quad p = 1, 2, 3, \ldots, \infty \tag{41}$$

Since the bound in Eq. (40) must be valid for all values of p,

$$H\{X\} \le \min_p H_p^*(\mu_p,\Xi) \tag{42}$$

If the random variable X is not centered with respect to the Lp-norm, the centered random variable $X_p^\circ = X - \Delta_p^\circ$ has the same entropy as X but a smaller Lp-norm. The entropy of X may be more tightly upper bounded by applying the bounds in Eqs. (40) and (42) to the entropy of $X_p^\circ$:

$$H\{X\} = H\{X_p^\circ\} \le H\{X_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ)\} = H_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ), \quad p = 1, 2, 3, \ldots, \infty \tag{43}$$

and

$$H\{X\} \le \min_p H_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ) \tag{44}$$

where $\mu_p^\circ$ is chosen to match the Lp-norm of $X_p^\circ$ (i.e., the centered Lp-norm of X):

$$M_p^*(\mu_p^\circ, \Xi - \Delta_p^\circ) = M_p\{X_p^\circ\} = M_p^\circ\{X\}, \quad p = 1, 2, 3, \ldots, \infty \tag{45}$$

Notice again that the bounds based on centered random variables are calculated with reference to the shifted value sets $\Xi - \Delta_p^\circ$, not the actual value set $\Xi$. An exception for which the centering operation leaves the value set unchanged (i.e., $\Xi - \Delta_p^\circ = \Xi$) occurs for the value set $\Xi = I$ (defined below) or, more generally, for any scaled version of it, $\Xi = qI$, as long as the allowable centering shifts $\Delta_p^\circ$ are constrained to multiples of the scale quantum q.

For many applications, the most interesting discrete value sets are the set of all integers $I = \{0, \pm 1, \pm 2, \pm 3, \ldots\}$ and the set of positive integers $I^+ = \{1, 2, 3, \ldots\}$. The maximum entropy for integer-valued random variables, $H_p^*(\mu,I)$, is plotted in Fig. 2 versus the logarithm of the corresponding Lp-norm, $\log[M_p^*(\mu,I)]$, for various values of p. Notice that the nonlinear relationship for integer-valued random variables becomes essentially linear when the Lp-norm is large compared to the (unit) interval between successive values in the value set I. In fact, all of the curves in Fig. 2 converge to the corresponding straight-line curves in Fig. 1 in the limit of large Lp-norm. Notice also how the continuous curves for large values of p < ∞ approach the limiting staircase curve for p = ∞. The maximum entropy curve for p = ∞ takes quantum jumps at integer values of the L∞-norm.

Closed-form maximum-entropy expressions as a function of Lp-norm can be obtained for discrete random variables in only a few special cases. Interesting cases include p = 1, ∞, for value sets $\Xi = I, I^+$:

$$H_1^*(\mu,I) = \log\!\left[M_1^*(\mu,I) + \sqrt{1 + [M_1^*(\mu,I)]^2}\right] + M_1^*(\mu,I)\,\log\!\left[\frac{M_1^*(\mu,I)}{\sqrt{1 + [M_1^*(\mu,I)]^2} - 1}\right]$$

$$H_\infty^*(\mu,I) = \log\!\left(2\lfloor M_\infty^*(\mu,I)\rfloor + 1\right) \tag{46}$$

$$H_1^*(\mu,I^+) = M_1^*(\mu,I^+)\, H_2\!\left[\frac{1}{M_1^*(\mu,I^+)}\right]$$

$$H_\infty^*(\mu,I^+) = \log\lfloor M_\infty^*(\mu,I^+)\rfloor \tag{47}$$

where $\lfloor a \rfloor$ is the integer part of a and $H_2[a]$ is the binary entropy function,

$$H_2[a] = \begin{cases} -a\log[a] - (1-a)\log[1-a], & 0 < a < 1 \\ 0, & a = 0 \text{ or } a = 1 \end{cases} \tag{48}$$

V. Alternative Entropy Bounds for Integer-Valued Random Variables

The maximum-entropy discrete distributions are not as useful as the maximum-entropy continuous distributions for unconstrained value sets, because closed-form results determining the maximum entropy for a given Lp-norm are available only in special cases. Alternative bounds on the entropy of discrete random variables can be obtained by approximating their discrete probability distributions with continuous probability densities and applying the simpler bounds on the differential entropy of continuous random variables. In this section, entropy bounds of this kind are obtained for integer-valued random variables ($\Xi = I$).

Associate with any integer-valued random variable X a corresponding continuous random variable x defined by

$$x = X + u \tag{49}$$

where u is a uniform (continuous) random variable over [−1/2, 1/2] which is independent of X. The probability density function f(x) of the continuous random variable x is related to the probability mass function F(X) of the discrete random variable X as:

$$f(x) = F(\lfloor x + 1/2 \rfloor) \tag{50}$$

where $\lfloor x + 1/2 \rfloor$ maps x to the nearest integer. The differential entropy of x equals the absolute entropy of X, i.e.,

$$h\{x\} = H\{X\} \tag{51}$$

and their Lp-norms are related as follows:

$$[M_p\{x\}]^p = \begin{cases} \displaystyle\sum_{\substack{r=0 \\ r\ \mathrm{even}}}^{p-1} \binom{p}{r} [M_{p-r}\{X\}]^{p-r}\, \frac{2^{-r}}{r+1} + \frac{2^{-p}}{p+1}\,F(0), & p = 1, 3, 5, \ldots \\[3ex] \displaystyle\sum_{\substack{r=0 \\ r\ \mathrm{even}}}^{p-2} \binom{p}{r} [M_{p-r}\{X\}]^{p-r}\, \frac{2^{-r}}{r+1} + \frac{2^{-p}}{p+1}, & p = 2, 4, 6, \ldots \end{cases}$$

$$M_\infty\{x\} = M_\infty\{X\} + \tfrac{1}{2} \tag{52}$$

Explicit formulas for p = 1, 2, 3, 4 are:

$$M_1\{x\} = M_1\{X\} + \tfrac{1}{4}F(0)$$

$$[M_2\{x\}]^2 = [M_2\{X\}]^2 + \tfrac{1}{12}$$

$$[M_3\{x\}]^3 = [M_3\{X\}]^3 + \tfrac{1}{4}M_1\{X\} + \tfrac{1}{32}F(0)$$

$$[M_4\{x\}]^4 = [M_4\{X\}]^4 + \tfrac{1}{2}[M_2\{X\}]^2 + \tfrac{1}{80} \tag{53}$$
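The moment relations between X and x = X + u can be verified exactly, because the pth absolute moment of x is a sum of elementary integrals of $|x|^p$ over unit intervals. The sketch below compares that direct computation with the even-r binomial sum; the probability mass function is an arbitrary example, not taken from the article.

```python
import math

def abs_moment(pmf, n):
    """E{|X|^n} for an integer-valued pmf given as {value: probability}."""
    return sum(w * abs(i) ** n for i, w in pmf.items())

def smoothed_moment_exact(pmf, p):
    """E{|X + u|^p}, integrating |x|^p over each interval [i - 1/2, i + 1/2]."""
    total = 0.0
    for i, w in pmf.items():
        a = abs(i)
        if a == 0:
            total += w * 2.0 * 0.5 ** (p + 1) / (p + 1)
        else:
            total += w * ((a + 0.5) ** (p + 1) - (a - 0.5) ** (p + 1)) / (p + 1)
    return total

def smoothed_moment_formula(pmf, p):
    """The same moment via the sum over even r, with the parity-dependent last term."""
    total = sum(math.comb(p, r) * abs_moment(pmf, p - r) * 2.0 ** (-r) / (r + 1)
                for r in range(0, p, 2))
    tail = 2.0 ** (-p) / (p + 1)
    # For odd p the last term is weighted by F(0); for even p it is not.
    return total + (tail * pmf.get(0, 0.0) if p % 2 else tail)

# Illustrative (assumed) integer-valued distribution.
pmf = {-3: 0.1, -1: 0.2, 0: 0.3, 1: 0.25, 4: 0.15}

differences = [abs(smoothed_moment_exact(pmf, p) - smoothed_moment_formula(pmf, p))
               for p in range(1, 7)]
```

Every entry of `differences` is zero up to rounding, for odd and even p alike.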
The entropy of the integer-valued random variable X is upper bounded by

$$H\{X\} = h\{x\} \le h_p^*(M_p\{x\}), \quad p = 1, 2, 3, \ldots, \infty \tag{54}$$

Explicit bounds for p = 1, 2, ∞ are:

$$H\{X\} \le \log[2e] + \log\!\left[M_1\{X\} + \tfrac{1}{4}F(0)\right]$$

$$H\{X\} \le \log[\sqrt{2\pi e}] + \tfrac{1}{2}\log\!\left[[M_2\{X\}]^2 + \tfrac{1}{12}\right]$$

$$H\{X\} \le \log[2] + \log\!\left[M_\infty\{X\} + \tfrac{1}{2}\right] \tag{55}$$

Since the bound in Eq. (54) is valid for all values of p,

$$H\{X\} \le \min_p h_p^*(M_p\{x\}) \tag{56}$$

The bound in Eq. (54) is not quite as tight as the achiev- 
able bound given earlier in Eq. (40), because the step- 
wise constant probability density of x = X + u given by 
Eq. (50) cannot exactly equal the maximum-entropy con- 
tinuous probability density specified by Eq. (10). However, 
a stepwise-constant approximation can be very accurate 
when the probability distribution is much wider than the 
unit step width. 
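The explicit bounds of Eq. (55) are simple enough to evaluate directly. The following sketch computes them in bits for two assumed example distributions: a uniform distribution on {−2, ..., 2}, for which x = X + u is exactly uniform and the p = ∞ bound is met with equality, and a truncated two-sided geometric distribution, for which the p = 1 bound is the tightest of the three.

```python
import math

def entropy_bits(pmf):
    """Absolute entropy in bits of a pmf given as {integer: probability}."""
    return -sum(w * math.log2(w) for w in pmf.values() if w > 0)

def alternative_bounds_bits(pmf):
    """Right-hand sides of Eq. (55) for p = 1, 2, and infinity, in bits."""
    f0 = pmf.get(0, 0.0)
    m1 = sum(w * abs(i) for i, w in pmf.items())
    m2_squared = sum(w * i * i for i, w in pmf.items())
    m_inf = max(abs(i) for i, w in pmf.items() if w > 0)
    return {
        1: math.log2(2 * math.e) + math.log2(m1 + f0 / 4),
        2: math.log2(math.sqrt(2 * math.pi * math.e))
           + 0.5 * math.log2(m2_squared + 1 / 12),
        math.inf: 1.0 + math.log2(m_inf + 0.5),
    }

# Assumed example 1: uniform on five integers.
uniform = {i: 0.2 for i in range(-2, 3)}

# Assumed example 2: two-sided geometric with ratio 0.8, truncated at |i| = 60.
ratio = 0.8
raw = {i: ratio ** abs(i) for i in range(-60, 61)}
z = sum(raw.values())
geometric = {i: w / z for i, w in raw.items()}

uniform_entropy = entropy_bits(uniform)
uniform_bounds = alternative_bounds_bits(uniform)
geometric_entropy = entropy_bits(geometric)
geometric_bounds = alternative_bounds_bits(geometric)
```

Taking the minimum over the returned dictionary gives the three-term version of the bound in Eq. (56).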


VI. Summary and Potential Applications 

This article has tabulated continuous probability density functions $f(x) = f_p^*(x;\mu)$ or $f(x) = f_p^*(x;\mu,\Xi)$ and discrete probability mass functions $F(\xi_i) = F_p^*(\xi_i;\mu,\Xi)$ which maximize the differential entropy $h\{x\}$ or absolute entropy $H\{X\}$, respectively, among all probability distributions with a given Lp-norm $M_p\{x\}$ or $M_p\{X\}$ and unconstrained or constrained value set $\Xi$. Expressions for the maximum entropy are evaluated as functions of the Lp-norm. These expressions are obtained in closed form for the case of unconstrained continuous random variables, and in this case there is a simple straight-line relationship between the maximum differential entropy and the logarithm of the Lp-norm. Corresponding expressions for discrete and constrained continuous random variables are given parametrically; closed-form expressions are available only for special cases. However, simpler alternative bounds on the maximum entropy of integer-valued random variables are obtained by applying the differential entropy results to continuous random variables which approximate the integer-valued random variables in a natural manner.


The results tabulated here have at least two potentially 
useful applications. First, they can lend a theoretical un- 
derpinning to source coding distortion measures based on 
Lp-norms. Second, they can be used to perform estimates 
of the local entropy of a dataset, for which the available 
local data are sufficient for obtaining good estimates of the 
dataset’s L p -norm but not for a good estimate of its his- 
togram. Follow-up articles on these two applications will 
appear in future issues. 
Fig. 1. Maximum differential entropy as a function of Lp-norm (p = 1, 2, 3, 4, 5, 6, 8, 10, 12, 16, ∞) for unconstrained continuous random variables.

Fig. 2. Maximum entropy as a function of Lp-norm (p = 1, 2, 3, 4, 5, 6, 8, 10, 12, 16, ∞) for integer-valued random variables.
Appendix

This appendix contains proofs or derivations omitted in the main text. Equations (1), (2), (3), (4), (9), (10), (22), (23), (25), (28), (32), (35), (36), (38), (41), (45), (48), and (49) are definitions and require no proof. Equations (7), (16), (17), (19), (20), (21), (24), (29), (30), (31), (37), (42), (43), (44), (53), (54), (55), and (56) are trivial or straightforward applications of preceding results. This leaves Eqs. (5), (6), (8), (11), (12), (13), (14), (15), (18), (26), (27), (33), (34), (39), (40), (46), (47), (50), (51), and (52) requiring further justification.

Equation (5) follows from the linearity of the expectation operator. Equations (11) and (12) come from standard integral tables [2]. Equations (13) and (15) require two elementary properties [2] of the gamma function: $\Gamma(1 + 1/p) = \Gamma(1/p)/p$ and $\Gamma(1/2) = \sqrt{\pi}$. Equations (6) and (8) result from applying the definitions in Eqs. (1) and (3) to the probability distributions of scaled and shifted random variables, obtained from standard texts [3] as


$$f'(x') = f(x'/q)/|q|$$

$$F'(X') = F(X'/q)$$

$$f''(x'') = f(x'' + \Delta)$$

$$F''(X'') = F(X'' + \Delta) \tag{A-1}$$

where $f'(x')$, $F'(X')$, $f''(x'')$, and $F''(X'')$ are probability density or probability mass functions for the scaled and shifted random variables $x'$, $X'$, $x''$, and $X''$.

Equations (14), (26), and (39) follow after observing that the logarithms of the probability distributions in Eqs. (10), (22), and (35) all consist of two terms, one term a constant and the second term proportional to $|x|^p$ or $|\xi_i|^p$. The expected value of the second term can thus be calculated directly from the preceding formulas, Eqs. (13), (24), and (37), for the Lp-norm.

Equations (18), (27), and (40) are the central results of this article and are proved by generalizing a technique used in [4] to show that maximum differential entropy with constrained second moment is achieved by a Gaussian distribution. If x and $x_p^*(\mu,\Xi)$ both have Lp-norm $M_p\{x\}$, then for p < ∞,

$$h\{x_p^*(\mu,\Xi)\} = -\int_\Xi f_p^*(x;\mu,\Xi)\log[f_p^*(x;\mu,\Xi)]\,dx$$

$$= \int_\Xi f_p^*(x;\mu,\Xi)\left\{\log[\alpha_p^*(\mu,\Xi)] + \log[e]\frac{|x|^p}{p\mu^p}\right\}dx$$

$$= \int_\Xi f(x)\left\{\log[\alpha_p^*(\mu,\Xi)] + \log[e]\frac{|x|^p}{p\mu^p}\right\}dx$$

$$= -\int_\Xi f(x)\log[f_p^*(x;\mu,\Xi)]\,dx \tag{A-2}$$

The third equality in Eq. (A-2) follows from the assumption that x and $x_p^*(\mu,\Xi)$ have identical Lp-norms, hence $|x|^p$ has the same expectation whether it is averaged over $f_p^*(x;\mu,\Xi)$ or $f(x)$. If p = ∞, the same result holds, the second term in the second and third lines of Eq. (A-2) is absent, and the integration over $\Xi$ is replaced by an integration over $\Xi \cap \{|x| \le \mu\}$. Continuing,

$$h\{x_p^*(\mu,\Xi)\} - h\{x\} = -\int_\Xi f(x)\log[f_p^*(x;\mu,\Xi)]\,dx + \int_{\Xi \cap \{f(x)>0\}} f(x)\log[f(x)]\,dx$$

$$= \int_{\Xi \cap \{f(x)>0\}} f(x)\log\!\left[\frac{f(x)}{f_p^*(x;\mu,\Xi)}\right]dx$$

$$\ge \int_{\Xi \cap \{f(x)>0\}} f(x)\log[e]\left\{1 - \frac{f_p^*(x;\mu,\Xi)}{f(x)}\right\}dx = 0 \tag{A-3}$$

The inequality in Eq. (A-3) results from the general inequality $\log[a] \ge \log[e](1 - 1/a)$ for all a > 0, and the last equality arises because $f_p^*(x;\mu,\Xi)$ and $f(x)$ both integrate to one.

The derivation in Eqs. (A-2) and (A-3) proves Eq. (27). Equation (18) is a special case of Eq. (27) obtained by setting $\Xi$ equal to the set of all real numbers. Equation (40) is derived in a similar manner by replacing the integrals in Eqs. (A-2) and (A-3) with summations and continuous probability density functions with discrete probability mass functions.
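The discrete version of this argument can be traced numerically. In the sketch below, the parameter $\mu_p$ is found by bisection so that the maximum-entropy probability mass function matches the Lp-norm of a given X, and the entropy gap is then compared with the relative-entropy sum that appears in the discrete analogue of Eq. (A-3). The finite value set, the example distribution, and the bisection bracket are all assumptions made for illustration; the bisection relies on the norm increasing with $\mu$ on the bracket.

```python
import math

def maxent_pmf(values, mu, p):
    """Maximum-entropy pmf of Eq. (35) on a finite value set (finite p)."""
    weights = [math.exp(-abs(v) ** p / (p * mu ** p)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def lp_norm(values, pmf, p):
    return sum(w * abs(v) ** p for v, w in zip(values, pmf)) ** (1.0 / p)

def entropy_nats(pmf):
    return -sum(w * math.log(w) for w in pmf if w > 0)

def match_mu(values, target_norm, p, lo=0.05, hi=100.0):
    """Bisection for the mu that satisfies the norm-matching condition of Eq. (41)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lp_norm(values, maxent_pmf(values, mid, p), p) < target_norm:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical finite value set and an arbitrary distribution on it.
values = list(range(-20, 21))
given = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 3: 0.1}
pmf_x = [given.get(v, 0.0) for v in values]
p = 2

mu_p = match_mu(values, lp_norm(values, pmf_x, p), p)
pmf_star = maxent_pmf(values, mu_p, p)

entropy_gap = entropy_nats(pmf_star) - entropy_nats(pmf_x)
divergence = sum(f * math.log(f / g) for f, g in zip(pmf_x, pmf_star) if f > 0)
```

With matched norms, `entropy_gap` equals `divergence`, which is nonnegative; this is the content of Eq. (40) for this example.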

Equation (33) results from noting that $|x_p^*(\mu,R^+)| \Leftrightarrow |x_p^*(\mu)|$, so the Lp-norms of $x_p^*(\mu,R^+)$ and $x_p^*(\mu)$ must be identical. Equation (34) comes from the fact that the constant scale factor $\alpha_p^*(\mu,R^+)$ for $f_p^*(x;\mu,R^+)$ in Eq. (22) with $\Xi = R^+$ is exactly half the corresponding scale factor for $f_p^*(x;\mu)$ in Eq. (10). This accounts for a difference of log[2] in the first terms in their respective expressions for differential entropy. The second terms must be equal by the previous observation linking them to their respective Lp-norms.

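To derive Eqs. (46) and (47), let $a = e^{-1/\mu}$ and replace $\Xi$ with $I$ or $I^+$ in Eqs. (36), (37), and (38) to obtain

$$A_1^*(\mu,I) = \sum_{i=-\infty}^{\infty} e^{-|i|/\mu} = 2\sum_{i=0}^{\infty} a^i - 1 = \frac{1+a}{1-a}$$

$$\mu B_1^*(\mu,I) = \sum_{i=-\infty}^{\infty} |i|\,e^{-|i|/\mu} = 2\sum_{i=0}^{\infty} i\,a^i = \frac{2a}{(1-a)^2}$$

$$A_\infty^*(\mu,I) = \sum_{|i| \le \mu} 1 = 2\lfloor\mu\rfloor + 1$$

$$M_\infty^*(\mu,I) = \sup_{|i| \le \mu} |i| = \lfloor\mu\rfloor \tag{A-4}$$

and

$$A_1^*(\mu,I^+) = \sum_{i=1}^{\infty} e^{-i/\mu} = \sum_{i=1}^{\infty} a^i = \frac{1}{1-a} - 1 = \frac{a}{1-a}$$

$$\mu B_1^*(\mu,I^+) = \sum_{i=1}^{\infty} i\,e^{-i/\mu} = \sum_{i=1}^{\infty} i\,a^i = \frac{a}{(1-a)^2}$$

$$A_\infty^*(\mu,I^+) = \sum_{1 \le i \le \mu} 1 = \lfloor\mu\rfloor$$

$$M_\infty^*(\mu,I^+) = \sup_{1 \le i \le \mu} |i| = \lfloor\mu\rfloor \tag{A-5}$$

where $\lfloor\mu\rfloor$ is the integer part of $\mu$. The entropy expressions in Eqs. (46) and (47) follow algebraically upon substitution of Eqs. (A-4) and (A-5) into Eqs. (37) and (39) and solving for the entropy in terms of the corresponding Lp-norm.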
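The algebra for p = 1 can be cross-checked by direct summation. The sketch below assumes the closed forms obtained by eliminating $a = e^{-1/\mu}$: for the positive integers, the entropy is $M H_2[1/M]$; for all integers, it is $\log[M + \sqrt{1+M^2}] + M\log[M/(\sqrt{1+M^2}-1)]$, where M is the L1-norm. The truncation point and the value of $\mu$ are arbitrary, and natural logarithms are used throughout.

```python
import math

def binary_entropy(a):
    """Binary entropy function of Eq. (48), in nats."""
    if a <= 0.0 or a >= 1.0:
        return 0.0
    return -a * math.log(a) - (1.0 - a) * math.log(1.0 - a)

def direct_norm_and_entropy(values, mu):
    """L1-norm and entropy (nats) of the p = 1 maximum-entropy pmf, by summation."""
    weights = [math.exp(-abs(v) / mu) for v in values]
    total = sum(weights)
    pmf = [w / total for w in weights]
    norm = sum(w * abs(v) for v, w in zip(values, pmf))
    ent = -sum(w * math.log(w) for w in pmf if w > 0)
    return norm, ent

def closed_form_all_integers(m):
    """Entropy as a function of the L1-norm m for the value set of all integers."""
    s = math.sqrt(1.0 + m * m)
    return math.log(m + s) + m * math.log(m / (s - 1.0))

def closed_form_positive_integers(m):
    """Entropy as a function of the L1-norm m for the positive integers."""
    return m * binary_entropy(1.0 / m)

mu = 2.5   # illustrative parameter value
n = 400    # truncation point; the neglected tail is far below rounding error

m_all, h_all = direct_norm_and_entropy(range(-n, n + 1), mu)
m_pos, h_pos = direct_norm_and_entropy(range(1, n + 1), mu)

a = math.exp(-1.0 / mu)
```

The directly summed norms also agree with $2a/(1-a^2)$ and $1/(1-a)$, which follow from the ratios of the sums in Eqs. (A-4) and (A-5).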

Equation (50) results from calculating the conditional probability density $f(x|X)$ of x given X, then averaging over X:

$$f(x|X) = \begin{cases} 1, & \text{if } |x - X| < 1/2 \\ 0, & \text{if } |x - X| > 1/2 \end{cases} = \begin{cases} 1, & \text{if } X = \lfloor x + 1/2 \rfloor \\ 0, & \text{if } X \ne \lfloor x + 1/2 \rfloor \end{cases}$$

$$f(x) = \sum_i F(i)\,f(x|X = i) = F(\lfloor x + 1/2 \rfloor)$$


Equation (51) results from breaking up the defining integral in Eq. (1) for the differential entropy into a sum of integrals over unit intervals,

$$h\{x\} = -\int_{-\infty}^{\infty} f(x)\log[f(x)]\,dx = -\sum_{i=-\infty}^{\infty} \int_{i-1/2}^{i+1/2} f(x)\log[f(x)]\,dx$$

$$= -\sum_{i=-\infty}^{\infty} \int_{i-1/2}^{i+1/2} F(i)\log[F(i)]\,dx = -\sum_{i=-\infty}^{\infty} F(i)\log[F(i)] = H\{X\}$$


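Equation (52) is derived by considering the cases of even and odd values of p separately. In the first case, when p is even,

$$E\{|X + u|^p\} = E\{(X + u)^p\} = \sum_{r=0}^{p} \binom{p}{r} E\{X^{p-r}\}\,E\{u^r\} = \sum_{\substack{r=0 \\ r\ \mathrm{even}}}^{p} \binom{p}{r} E\{X^{p-r}\}\,\frac{2^{-r}}{r+1} \tag{A-8}$$

because

$$E\{u^r\} = \begin{cases} \dfrac{2^{-r}}{r+1}, & \text{if } r \text{ is even} \\ 0, & \text{if } r \text{ is odd} \end{cases}$$

Thus, since $X^{p-r} = |X|^{p-r}$ when p and r are both even,

$$E\{|X + u|^p\} = \frac{2^{-p}}{p+1} + \sum_{\substack{r=0 \\ r\ \mathrm{even}}}^{p-2} \binom{p}{r} \frac{2^{-r}}{r+1}\,E\{|X|^{p-r}\}$$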

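In the second case, when p is odd, the derivation begins by writing

$$|X + u| = |X| + w$$

where

$$w = \begin{cases} +u, & X \ge 1 \\ |u|, & X = 0 \\ -u, & X \le -1 \end{cases} \tag{A-12}$$

This decomposition is valid because X is integer valued. The conditional moments of w are

$$E\{w^r|X\} = \begin{cases} \dfrac{2^{-r}}{r+1}, & \text{if } r \text{ is even or if } r \text{ is odd and } X = 0 \\ 0, & \text{if } r \text{ is odd and } X \ne 0 \end{cases} \tag{A-13}$$

Thus,

$$E\{|X + u|^p\} = E\{(|X| + w)^p\} = E\left\{\sum_{r=0}^{p} \binom{p}{r} |X|^{p-r} w^r\right\}$$

$$= \sum_{i \ne 0} F(i) \sum_{r=0}^{p} \binom{p}{r} |i|^{p-r} E\{w^r|X = i\} + F(0)\,E\{w^p|X = 0\}$$

$$= \sum_{i \ne 0} F(i) \sum_{\substack{r=0 \\ r\ \mathrm{even}}}^{p-1} \binom{p}{r} |i|^{p-r}\,\frac{2^{-r}}{r+1} + \frac{2^{-p}}{p+1}\,F(0)$$

$$= \sum_{\substack{r=0 \\ r\ \mathrm{even}}}^{p-1} \binom{p}{r} \frac{2^{-r}}{r+1}\,E\{|X|^{p-r}\} + \frac{2^{-p}}{p+1}\,F(0) \tag{A-14}$$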